54 research outputs found

    Predicting whole genome protein interaction networks from primary sequence data in model and non-model organisms using ENTS

    Background: The large-scale identification of physical protein-protein interactions (PPIs) is an important step toward understanding how biological networks evolve and generate emergent phenotypes. However, experimental identification of PPIs is a laborious and error-prone process, and current methods of PPI prediction tend to be highly conservative or require large amounts of functional data that may not be available for newly sequenced organisms.
    Results: In this study we demonstrate a random-forest-based technique, ENTS, for the computational prediction of protein-protein interactions based only on primary sequence data. Our approach is able to efficiently predict interactions on a whole-genome scale for any eukaryotic organism, using pairwise combinations of conserved domains and predicted subcellular localization of proteins as input features. We present the first predicted interactome for the forest tree Populus trichocarpa in addition to the predicted interactomes for Saccharomyces cerevisiae, Homo sapiens, Mus musculus, and Arabidopsis thaliana. Comparing our approach to other PPI predictors, we find that ENTS performs comparably to or better than a number of existing approaches, including several that utilize a variety of functional information for their predictions. We also find that the predicted interactions are biologically meaningful, as indicated by similarity in functional annotations and enrichment of co-expressed genes in public microarray datasets. Furthermore, we demonstrate some of the biological insights that can be gained from these predicted interaction networks. We show that the predicted interactions yield informative groupings of P. trichocarpa metabolic pathways, literature-supported associations among human disease states, and theory-supported insight into the evolutionary dynamics of duplicated genes in paleopolyploid plants.
    Conclusion: We conclude that the ENTS classifier will be a valuable tool for the de novo annotation of genome sequences, providing initial clues about regulatory and metabolic network topology, and revealing relationships that are not immediately obvious from traditional homology-based annotations.
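
The abstract above names "pairwise combinations of conserved domains" and predicted subcellular localization as the classifier's input features but does not say how they are encoded. The following is only a minimal sketch of one plausible encoding, not the ENTS implementation: the function names, the binary indicator vector, and the example domain identifiers are all assumptions made for illustration.

```python
from itertools import product


def domain_pair_features(domains_a, domains_b):
    """Return the set of unordered domain pairs formed across two proteins.

    Each pair is stored as a sorted tuple so that (A, B) and (B, A)
    map to the same feature. This symmetry is an assumption, not
    something the abstract specifies.
    """
    return {tuple(sorted(pair)) for pair in product(domains_a, domains_b)}


def localization_feature(loc_a, loc_b):
    """Combine two predicted localizations into one unordered feature."""
    return tuple(sorted((loc_a, loc_b)))


def encode(features, vocabulary):
    """Turn a feature set into a 0/1 vector over a fixed, ordered vocabulary."""
    return [1 if f in features else 0 for f in vocabulary]


# Hypothetical example: identifiers and localizations are illustrative only.
protein_a = {"domains": {"PF00069", "PF00017"}, "loc": "cytoplasm"}
protein_b = {"domains": {"PF00069"}, "loc": "nucleus"}

features = domain_pair_features(protein_a["domains"], protein_b["domains"])
features.add(localization_feature(protein_a["loc"], protein_b["loc"]))

vocabulary = sorted(features)          # in practice, built from the training set
vector = encode(features, vocabulary)  # one row of a classifier's input matrix
```

A vector of this kind could be passed to any random-forest implementation; which library and which hyperparameters the authors used is not stated in the abstract.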

    Genome-wide analysis of Aux/IAA and ARF gene families in Populus trichocarpa

    Background: Auxin/Indole-3-Acetic Acid (Aux/IAA) and Auxin Response Factor (ARF) transcription factors are key regulators of auxin responses in plants. We identified the suites of genes in the two gene families in Populus and performed comparative genomic analysis with Arabidopsis and rice.
    Results: A total of 35 Aux/IAA and 39 ARF genes were identified in the Populus genome. Comparative phylogenetic analysis revealed that several Aux/IAA and ARF subgroups have differentially expanded or contracted between the two dicotyledonous plants. Activator ARF genes were found to be two-fold overrepresented in the Populus genome. PoptrIAA and PoptrARF gene families appear to have expanded due to high segmental and low tandem duplication events. Furthermore, expression studies showed that genes in the expanded PoptrIAA3 subgroup display differential expression.
    Conclusion: The present study examines the extent of conservation and divergence in the structure and evolution of Populus Aux/IAA and ARF gene families with respect to Arabidopsis and rice. The gene-family analysis reported here will be useful in conducting future functional genomics studies to understand how the molecular roles of these large gene families translate into a diversity of biologically meaningful auxin effects.

    Contrasting patterns of evolution following whole genome versus tandem duplication events in Populus

    Comparative analysis of multiple angiosperm genomes has implicated gene duplication in the expansion and diversification of many gene families. However, empirical data and theory suggest that whole-genome and small-scale duplication events differ with respect to the types of genes preserved as duplicate pairs. We compared gene duplicates resulting from a recent whole-genome duplication to a set of tandemly duplicated genes in the model forest tree Populus trichocarpa. We used a combination of microarray expression analyses of a diverse set of tissues and functional annotation to assess factors related to the preservation of duplicate genes of both types. Whole-genome duplicates are 700 bp longer and are expressed in 20% more tissues than tandem duplicates. Furthermore, certain functional categories are over-represented in each class of duplicates. In particular, disease resistance genes and receptor-like kinases commonly occur in tandem but are significantly under-retained following whole-genome duplication, while whole-genome duplicate pairs are enriched for members of signal transduction cascades and transcription factors. The shape of the distribution of expression divergence for duplicated pairs suggests that nearly half of the whole-genome duplicates have diverged in expression by a random degeneration process. The remaining pairs have more conserved gene expression than expected by chance, consistent with a role for selection under the constraints of gene balance. We hypothesize that duplicate gene preservation in Populus is driven by a combination of subfunctionalization of duplicate pairs and purifying selection favoring retention of genes encoding proteins with large numbers of interactions.

    Efficiency of gene silencing in Arabidopsis: direct inverted repeats vs. transitive RNAi vectors

    We investigated the efficiency of RNA interference (RNAi) in Arabidopsis using transitive and homologous inverted repeat (hIR) vectors. hIR constructs carry self-complementary intron-spliced fragments of the target gene, whereas transitive vectors have the target sequence fragment adjacent to an intron-spliced, inverted repeat of heterologous origin. Both transitive and hIR constructs facilitated specific and heritable silencing in the three genes studied (AP1, ETTIN and TTG1). Both types of vectors produced a phenotypic series that phenocopied reduction-of-function mutants for the respective target gene. The hIR constructs yielded up to fourfold higher proportions of events with strongly manifested reduction-of-function phenotypes compared to transitive RNAi. We further investigated the efficiency and potential off-target effects of AP1 silencing by both types of vectors using genome-scale microarrays and quantitative RT-PCR. The depletion of AP1 transcripts coincided with reduction-of-function phenotypic changes among both hIR and transitive lines, and the two types of lines showed similar expression patterns among differentially regulated genes. We did not detect significant silencing directed against homologous potential off-target genes when constructs were designed with minimal sequence similarity. Both hIR and transitive methods are useful tools in plant biotechnology and genomics. The choice of vector will depend on specific objectives such as cloning throughput, number of events and degree of suppression required.

    The obscure events contributing to the evolution of an incipient sex chromosome in Populus: a retrospective working hypothesis

    Genetic determination of gender is a fundamental developmental and evolutionary process in plants. Although it appears that dioecy in Populus is genetically controlled, the precise gender-determining systems remain unclear. The recently released second draft assembly and annotated gene set of the Populus genome provided an opportunity to revisit this topic. We hypothesized that over evolutionary time, selective pressure has reformed the genome structure and gene composition in the peritelomeric region of chromosome XIX, which has resulted in a distinctive genome structure and cluster of genes contributing to gender determination in Populus trichocarpa. Multiple lines of evidence support this working hypothesis. First, the peritelomeric region of chromosome XIX contains significantly fewer single nucleotide polymorphisms than the rest of the Populus genome and has a distinct evolutionary history. Second, the peritelomeric end of chromosome XIX contains the largest cluster of the nucleotide-binding site–leucine-rich repeat (NBS–LRR) class of disease resistance genes in the entire Populus genome. Third, there is a high occurrence of small microRNAs on chromosome XIX, coincident with the region containing the putative gender-determining locus and the major cluster of NBS–LRR genes. Further, by analyzing the metabolomic profiles of floral buds in male and female Populus trees using gas chromatography-mass spectrometry, we found that there are gender-specific accumulations of phenolic glycosides. Taken together, these findings led to the hypothesis that resistance to and regulation of a floral pathogen and gender determination coevolved, and that these events triggered the emergence of a nascent sex chromosome. Further studies of chromosome XIX will provide new insights into the genetic control of gender determination in Populus.

    Genome resequencing reveals multiscale geographic structure and extensive linkage disequilibrium in the forest tree Populus trichocarpa

    This is the publisher’s final pdf. The article is copyrighted by the New Phytologist Trust and published by John Wiley & Sons, Inc. It can be found at: http://onlinelibrary.wiley.com/journal/10.1111/%28ISSN%291469-8137. To the best of our knowledge, one or more authors of this paper were federal employees when contributing to this work.
    • Plant population genomics informs evolutionary biology, breeding, conservation and bioenergy feedstock development. For example, the detection of reliable phenotype–genotype associations and molecular signatures of selection requires a detailed knowledge about genome-wide patterns of allele frequency variation, linkage disequilibrium and recombination.
    • We resequenced 16 genomes of the model tree Populus trichocarpa and genotyped 120 trees from 10 subpopulations using 29 213 single-nucleotide polymorphisms.
    • Significant geographic differentiation was present at multiple spatial scales, and range-wide latitudinal allele frequency gradients were strikingly common across the genome. The decay of linkage disequilibrium with physical distance was slower than expected from previous studies in Populus, with r² dropping below 0.2 within 3–6 kb. Consistent with this, estimates of recent effective population size from linkage disequilibrium (N_e ≈ 4000–6000) were remarkably low relative to the large census sizes of P. trichocarpa stands. Fine-scale rates of recombination varied widely across the genome, but were largely predictable on the basis of DNA sequence and methylation features.
    • Our results suggest that genetic drift has played a significant role in the recent evolutionary history of P. trichocarpa. Most importantly, the extensive linkage disequilibrium detected suggests that genome-wide association studies and genomic selection in undomesticated populations may be more feasible in Populus than previously assumed.
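
The abstract above reports linkage disequilibrium as r² between pairs of SNPs. As a reminder of what that statistic measures, here is a minimal sketch of the standard pairwise r² for two biallelic loci. It assumes phased haplotypes with alleles coded 0/1; the abstract does not say how the authors estimated r² from their genotype data, and the function name and input layout are illustrative.

```python
def ld_r2(haplotypes):
    """Pairwise r^2 between two biallelic loci.

    haplotypes: list of (a, b) tuples, one per sampled chromosome,
    with the allele at each locus coded 0 or 1 (phased data assumed).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return d * d / denom


# Alleles always co-occur: complete LD.
complete = [(0, 0), (1, 1), (0, 0), (1, 1)]
# All four haplotypes equally frequent: no LD.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Plotting this quantity against the physical distance between SNP pairs gives the kind of decay curve the abstract summarizes as "r² dropping below 0.2 within 3–6 kb".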